Tag

Part:BBa_K5054001

Designed by: José Ángel Fernández   Group: iGEM24_TU-Delft   (2024-07-22)


uTP2 - UCYN-A transit peptide motif combination 21634


Sequence and features


Assembly Compatibility:
  • RFC[10]: compatible
  • RFC[12]: compatible
  • RFC[21]: compatible
  • RFC[23]: compatible
  • RFC[25]: compatible
  • RFC[1000]: compatible

Profile

Name: uTP2
Base Pairs: 267 bp
Origin: Braarudosphaera bigelowii
Properties: Nitroplast transit peptide


Usage and Biology

The marine microalga Braarudosphaera bigelowii harbors an endosymbiotic nitrogen-fixing bacterium, UCYN-A, which has recently been shown to be undergoing genome reduction and to show other signs of organellogenesis. The term "nitroplast" has thus been coined to refer to UCYN-A. Like mitochondria and chloroplasts, UCYN-A does not encode all the proteins it needs to function; some are instead encoded in the B. bigelowii genome, translated in the cytoplasm, and imported into UCYN-A. This sequence encodes a transit peptide that tags a protein for import into UCYN-A, so a protein expressed with this tag is expected to be imported into the endosymbiont. This could enable in vivo experiments on B. bigelowii, or the expression of natively imported proteins in transgenic organisms, which could in turn allow UCYN-A to be transplanted into them.

In silico characterization

Building upon the work of Coale et al. [1], we aimed to advance the understanding of uTP by identifying its precise sequence. Starting from the raw proteomics data of [1], we selected 368 proteins that are expressed by the host and significantly enriched in UCYN-A, and performed multiple sequence alignment (MSA) on them. Using the alignment, we identified a strongly conserved C-terminal region in many of the imported proteins, similar to the one reported in [1]. We then selected a subset of 206 proteins with highly similar (>60% sequence identity) C-terminal alignments, indicating that these proteins are likely to contain uTP.

Figure 1. Multiple sequence alignment (MSA) of all UCYN-A enriched sequences. The strongly aligned C-terminal region is highlighted (positions 880-1010 in the alignment).
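The identity filter described above can be sketched as follows. This is a minimal illustration, not the pipeline we ran: it assumes identity is measured against a column-wise consensus of the conserved region, and the sequences, column range, and function names (`column_consensus`, `percent_identity`, `select_conserved`) are hypothetical toy values.

```python
from collections import Counter


def column_consensus(rows: list[str]) -> str:
    """Most common non-gap residue per alignment column ('-' if all gaps)."""
    consensus = []
    for column in zip(*rows):
        residues = [r for r in column if r != "-"]
        consensus.append(Counter(residues).most_common(1)[0][0] if residues else "-")
    return "".join(consensus)


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def select_conserved(alignment: dict[str, str], start: int, end: int,
                     threshold: float = 60.0) -> list[str]:
    """Names of sequences whose region [start, end) exceeds the identity threshold.

    Assumption: identity is computed against the region's consensus; the
    original analysis may have used a different reference.
    """
    region = {name: row[start:end] for name, row in alignment.items()}
    consensus = column_consensus(list(region.values()))
    return [name for name, seq in region.items()
            if percent_identity(seq, consensus) > threshold]


# Toy alignment (hypothetical sequences, not real B. bigelowii proteins).
toy_alignment = {
    "protA": "MK--AAGLWEELLK",
    "protB": "MKTTAAGLWEELLK",
    "protC": "MRT-SAGLWDELLK",
    "protD": "----QPTRSYVNHG",
}
print(select_conserved(toy_alignment, start=6, end=14))
```

With real data, `start` and `end` would correspond to the highlighted alignment columns, and the threshold to the 60% cut-off.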

Motif analysis produced findings similar to those of [1], revealing 8 conserved motifs in the C-terminal region (Fig 2). Further investigation of motif co-occurrence and relative positioning uncovered common patterns: two motifs consistently appeared at fixed positions near the start of the C-terminal region, followed by various combinations of the remaining motifs. This arrangement suggests a possible sub-organellar localization mechanism, in which the first two motifs target the protein to UCYN-A while the subsequent motifs specify its localization within the endosymbiont. Chloroplast targeting works in a comparable way: a bipartite N-terminal targeting sequence specifies stromal and thylakoid localization. More research is needed, however, to test this hypothesis.


Figure 2. Five uTP variations discovered among UCYN-A imported sequences. The left panel shows the positions of the conserved motifs in the different uTP variations relative to motif #1, which was present in all examined sequences.
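The relative-positioning analysis can be illustrated with a short sketch. It assumes motif hits are available as (motif number, start position) pairs per protein, which is a hypothetical input format; the positions and protein names below are invented for illustration.

```python
from collections import Counter

Hit = tuple[int, int]  # (motif number, start position in the sequence)


def relative_pattern(hits: list[Hit], anchor: int = 1) -> tuple[Hit, ...]:
    """Motif hits re-expressed as offsets from the anchor motif, sorted by offset."""
    starts = dict(hits)
    if anchor not in starts:
        raise ValueError(f"anchor motif {anchor} not found")
    base = starts[anchor]
    return tuple(sorted(((m, s - base) for m, s in hits),
                        key=lambda hit: (hit[1], hit[0])))


def motif_order(hits: list[Hit], anchor: int = 1) -> tuple[int, ...]:
    """Order in which motifs occur along the sequence."""
    return tuple(m for m, _ in relative_pattern(hits, anchor))


def count_variations(all_hits: dict[str, list[Hit]]) -> Counter:
    """How many proteins share each motif order (one order = one uTP variation)."""
    return Counter(motif_order(hits) for hits in all_hits.values())


# Toy motif hits (hypothetical positions).
toy_hits = {
    "p1": [(1, 900), (2, 915), (4, 940)],
    "p2": [(1, 880), (2, 895), (4, 921)],
    "p3": [(1, 902), (2, 917), (5, 950), (3, 935)],
}
for order, n in count_variations(toy_hits).most_common():
    print(order, n)
```

Grouping by motif order rather than by exact offsets tolerates small positional shifts between proteins; whether that matches the grouping used for Figure 2 is an assumption.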


We investigated the relationship between transit peptide (uTP) sequences and the functional core of proteins, known as the mature domain. The mature domain is the part of a protein that remains after the transit peptide is cleaved off and that performs the protein's primary function. Given the observed diversity in uTP sequences, understanding their connection to specific mature domains is crucial for designing effective uTP constructs for future experiments, since certain uTP sequences may only be compatible with specific proteins. To explore potential correlations between uTP motif patterns and mature domain sequences, we trained classifiers to predict the appropriate uTP sequence (that is, the correct combination of motifs) from a given mature domain sequence. The classifiers were evaluated using a permutation test [2], with 3 of the 4 classifiers yielding statistically significant results (p < 0.05).

Figure 3. Classification results of 4 different classifiers trained to predict uTP sequences based on mature domains.
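The permutation test of [2] compares a classifier's score on the true labels with its scores after the labels are shuffled. The sketch below shows the idea using only the standard library. The leave-one-out 1-nearest-neighbour classifier and the two-dimensional toy features are stand-ins chosen for brevity; they are assumptions and not the classifiers or features used in this work.

```python
import random


def loo_accuracy(features: list[tuple[float, ...]], labels: list[str]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, x in enumerate(features):
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, features[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(features)


def permutation_test(features, labels, n_permutations: int = 200, seed: int = 0):
    """Return (observed accuracy, p-value) by re-scoring on shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(features, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if loo_accuracy(features, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


# Toy data: two separated groups standing in for mature-domain features,
# labelled with hypothetical motif combinations.
xs = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0]
toy_features = [(x, 0.0) for x in xs] + [(x + 50.0, 5.0) for x in xs]
toy_labels = ["1-2-4"] * 6 + ["1-2-3-5"] * 6

accuracy, p = permutation_test(toy_features, toy_labels)
print(f"accuracy={accuracy:.2f}, p={p:.4f}")
```

A small p-value indicates that the classifier exploits a real dependency between features and labels rather than chance structure in the data.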


For in vivo characterization, we constructed candidate uTP sequences by concatenating the consensus sequences of the discovered motif patterns. The uTP sequence classifiers were used to select the motif combinations predicted for the fluorescent proteins we planned to use, mVenus and mNeonGreen. The two sequences with the highest confidence values (uTP1 and uTP2) were selected for in vivo experiments.

To further validate the constructed sequences, we examined their predicted structure. Structural prediction was performed on all 206 selected uTP-containing B. bigelowii proteins to uncover the 3D conformation of uTP. The predicted structures were aligned, and a consensus structure was created by averaging the aligned regions. This revealed a highly conserved structural region (standard deviation per residue position < 1.8 Å) with 2 alpha-helices arranged into a U-bend (Fig 4). We then predicted the structures of mNeonGreen and mVenus fused to uTP1 and uTP2 and aligned the consensus structure onto them. The alignment was good (RMSD ≤ 4.0 Å), suggesting that our constructs will likely behave similarly to native uTP-containing proteins.

Figure 4. Structural predictions. (a) Consensus structure of all uTP sequences extracted from UCYN-A imported proteins. (b) Consensus structure of all uTP sequences with charged residues shown (red=negative, blue=positive). (c) Consensus structure aligned onto uTP1 + mNeonGreen construct (RMSD=4.00Å). (d) Consensus structure aligned onto uTP2 + mNeonGreen construct (RMSD=3.77Å).
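The consensus-structure step can be sketched as below. The example assumes the structures have already been superposed and trimmed to the same residue positions (the superposition itself is not shown), and it interprets the per-position spread as the root-mean-square distance from the consensus, which is one possible reading of the "stdev per residue position". Coordinates and function names are hypothetical.

```python
import math

Coord = tuple[float, float, float]


def consensus_structure(structures: list[list[Coord]]) -> list[Coord]:
    """Average coordinate of every residue position across superposed structures."""
    n = len(structures)
    length = len(structures[0])
    return [
        tuple(sum(s[i][axis] for s in structures) / n for axis in range(3))
        for i in range(length)
    ]


def per_residue_spread(structures: list[list[Coord]],
                       consensus: list[Coord]) -> list[float]:
    """RMS distance of each residue position from the consensus, in input units."""
    n = len(structures)
    return [
        math.sqrt(sum(math.dist(s[i], c) ** 2 for s in structures) / n)
        for i, c in enumerate(consensus)
    ]


def rmsd(a: list[Coord], b: list[Coord]) -> float:
    """Root-mean-square deviation between two superposed coordinate lists."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


# Toy example: two two-residue "structures" (coordinates in Å, invented).
s1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
s2 = [(0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
consensus = consensus_structure([s1, s2])
print(consensus)
print(per_residue_spread([s1, s2], consensus))
print(rmsd(s1, s2))
```

In practice the coordinates would typically be C-alpha positions taken from the predicted models after structural alignment.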


In the case of mitochondrial and chloroplast imported proteins, there is a well-known overlap: numerous proteins have twin transit peptides or ambiguous targeting sequences that target both organelles. To investigate whether there is a similar overlap between the potential UCYN-A import system and other known cellular transport systems, we ran established protein localization prediction tools on the list of potential UCYN-A imported proteins. The predictions proved inconclusive, although a large minority (28%) of the proteins were classified as secreted (Fig 5). This suggests that the UCYN-A import system, like other protein transport mechanisms, might be related to the Sec system. We also searched public databases (NCBI, PDB, AlphaFold/Proteome) for both sequence and structural homologs of uTP and found no significant matches.


Figure 5. Average predicted confidence for different localization categories.
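A summary like the one in Figure 5 can be computed from per-protein prediction scores as sketched here. The input format (one dictionary of category confidences per protein), the category names, and the scores are assumptions made for illustration; they do not come from the tools' actual output.

```python
from collections import Counter, defaultdict


def summarize_localization(predictions: dict[str, dict[str, float]]):
    """Return (mean confidence per category, fraction of proteins per top category).

    A category missing from a protein's scores is treated as confidence 0.
    """
    totals: dict[str, float] = defaultdict(float)
    top_calls: Counter = Counter()
    for scores in predictions.values():
        for category, confidence in scores.items():
            totals[category] += confidence
        top_calls[max(scores, key=scores.get)] += 1
    n = len(predictions)
    mean_confidence = {c: total / n for c, total in totals.items()}
    top_fraction = {c: count / n for c, count in top_calls.items()}
    return mean_confidence, top_fraction


# Toy predictions (hypothetical proteins, categories, and scores).
toy_predictions = {
    "p1": {"secreted": 0.5, "cytoplasm": 0.25, "chloroplast": 0.25},
    "p2": {"secreted": 0.25, "cytoplasm": 0.5, "chloroplast": 0.25},
    "p3": {"secreted": 0.25, "cytoplasm": 0.25, "chloroplast": 0.5},
    "p4": {"secreted": 0.25, "cytoplasm": 0.5, "chloroplast": 0.25},
}
mean_conf, top_frac = summarize_localization(toy_predictions)
print(mean_conf)
print(top_frac)
```

The fraction of top calls corresponds to a statement such as "28% were classified as secreted", while the mean confidence corresponds to the per-category averages plotted in the figure.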


The inconclusive homology search and localization prediction results, together with the fact that transport signals are in most cases found at the N-terminus of proteins [3, 4, 5] rather than at the C-terminus as with uTP, suggest that the protein import machinery associated with uTP is quite distinct from other known systems in the cell.

References

[1] Coale, T. H., Loconte, V., Turk-Kubo, K. A., Vanslembrouck, B., Mak, W. K. E., Cheung, S., Ekman, A., Chen, J., Hagino, K., Takano, Y., Nishimura, T., Adachi, M., Gros, M. L., Larabell, C., & Zehr, J. P. (2024). Nitrogen-fixing organelle in a marine alga. Science, 384(6692), 217–222. https://doi.org/10.1126/science.adk1075

[2] Ojala, M., & Garriga, G. C. (2010). Permutation tests for studying classifier performance. Journal of Machine Learning Research, 11, 1833–1863.

[3] Dong, C., Shi, Z., Huang, L., Zhao, H., Xu, Z., & Lian, J. (2021). Cloning and characterization of a panel of mitochondrial targeting sequences for compartmentalization engineering in Saccharomyces cerevisiae. Biotechnology and Bioengineering, 118(11), 4269–4277. https://doi.org/10.1002/BIT.27896

[4] https://parts.igem.org/Part:BBa_K4806014

[5] Albiniak, A. M., Baglieri, J., & Robinson, C. (2012). Targeting of lumenal proteins across the thylakoid membrane. Journal of Experimental Botany, 63(4), 1689–1698. https://doi.org/10.1093/JXB/ERR444


Categories
Parameters
None